# GGUF Efficient Inference

3b Zh Ft Research Release Q4 K M GGUF

This is a Chinese language model converted from canopylabs/3b-zh-ft-research_release to GGUF format, suitable for text generation tasks.

Large Language Model Chinese

3b De Ft Research Release Q4 K M GGUF

This is a GGUF-format German language model converted from canopylabs/3b-de-ft-research_release, suitable for text generation tasks.

Large Language Model German

3b Es It Ft Research Release Q4 K M GGUF

This is a GGUF format model converted from canopylabs/3b-es_it-ft-research_release, supporting Spanish and Italian.

Large Language Model Supports Multiple Languages

Qwen2.5 VL 72B Instruct GGUF

Qwen2.5-VL-72B-Instruct is a 72B-parameter multimodal model for vision-language tasks: it takes images and text as input and generates text about the images.

Image-to-Text English

Bge Reranker V2 M3 Q4 K M GGUF

This model is a GGUF format conversion of BAAI/bge-reranker-v2-m3, designed for text ranking tasks with multilingual support.

Text Embedding Other

Llama 3.1 Nemotron Nano 8B V1 GGUF

An 8B-parameter open-source large language model released by NVIDIA, based on the Llama 3.1 architecture and available in multiple quantization versions.

Large Language Model English

Gemma 3 12b It Q6 K GGUF

This is a GGUF quantized (Q6_K) version of Google's Gemma 3 12B instruction-tuned model, suitable for local deployment and inference.

Large Language Model

Mistral 7B Business F16 GGUF

This is a business domain-adapted model based on Mistral-7B, converted to GGUF format for use in llama.cpp.

Large Language Model English

rafaelldietrich

Teuken 7B Instruct Research V0.4 Q6 K GGUF

This model is a GGUF format conversion based on openGPT-X/Teuken-7B-instruct-research-v0.4, supporting multilingual text generation tasks.

Large Language Model Supports Multiple Languages

Noticia 7B GGUF

NoticIA-7B is a Spanish-language news summarization model, specialized in processing news content and generating summaries.

Large Language Model Spanish

Bge Reranker Large Q4 K M GGUF

This model is converted from BAAI/bge-reranker-large into GGUF format for reranking tasks, supporting both Chinese and English.

Text Embedding Supports Multiple Languages

Bge Reranker V2 M3 Q4 K M GGUF

This model is converted from BAAI/bge-reranker-v2-m3 into GGUF format for text reranking tasks, supporting multiple languages.

Text Embedding Other

Gguf Sharded LaMini Flan T5 248M

This is a GGUF format model converted from MBZUAI/LaMini-Flan-T5-248M, suitable for text generation tasks.

Large Language Model English

Llava Llama 3 8b V1 1 Q3 K S GGUF

This model is a GGUF format conversion based on xtuner/llava-llama-3-8b-v1_1, supporting multimodal processing of images and text.

Wizardlm 2 8x22B GGUF

WizardLM-2-8x22B-GGUF is the GGUF quantized version of Microsoft's WizardLM-2-8x22B model, available at several quantization bit widths and suitable for text generation tasks.

Large Language Model

Taiwan LLM 13B V2.0 Chat GGUF

A large language model based on Llama 2 13B that supports Traditional Chinese, provided in GGUF format.

Large Language Model Chinese

© 2025 AIbase